Improved Discovery of Molecular Interactions in Genome-Scale Data with Adaptive Model-Based Normalization

Authors

  • Julia Salzman
  • Daniel M. Klass
  • Patrick O. Brown
Abstract

BACKGROUND High-throughput molecular-interaction studies using immunoprecipitations (IP) or affinity purifications are powerful and widely used in biology research. One of many important applications of this method is to identify the set of RNAs that interact with a particular RNA-binding protein (RBP). Here, the unique statistical challenge is to delineate a specific set of RNAs that are enriched in one sample relative to another, typically a specific IP compared to a non-specific control that models background. The choice of normalization procedure critically impacts the number of RNAs that will be identified as interacting with an RBP at a given significance threshold, yet existing normalization methods make assumptions that are often fundamentally inaccurate when applied to IP enrichment data.

METHODS In this paper, we present a new normalization methodology that is specifically designed for identifying enriched RNA or DNA sequences in an IP. The normalization (called adaptive or AD normalization) uses a basic model of the IP experiment and is not a variant of mean, quantile, or other previously proposed methodology. The approach is evaluated statistically and tested with simulated and empirical data.

RESULTS AND CONCLUSIONS For the RBPs we analyzed, the adaptive (AD) normalization method results in a greatly increased range in the number of enriched RNAs identified, fewer false positives, and overall better concordance with independent biological evidence, compared to median normalization. The approach is also applicable to the study of pairwise RNA, DNA and protein interactions, such as the analysis of transcription factors via chromatin immunoprecipitation (ChIP), or to any other experiment in which samples from two conditions are studied and one contains an enriched subset of the other.
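As a rough illustration of why the choice of normalization matters, the sketch below applies plain median normalization (the baseline the abstract compares against, not the AD method, whose details are not given here) to log2 IP/control ratios. The data, the function name `median_normalized_log_ratios`, and the 2-fold calling threshold are all hypothetical. Median centering implicitly assumes that most RNAs are not enriched; when a majority are enriched, the median sits on the enriched group and none of them is called.

```python
import math
from statistics import median


def median_normalized_log_ratios(ip, control):
    """Return log2(IP/control) ratios shifted so that their median is zero.

    This is ordinary median normalization, shown only as a baseline;
    it is NOT the adaptive (AD) normalization described in the abstract.
    """
    ratios = [math.log2(i / c) for i, c in zip(ip, control)]
    center = median(ratios)
    return [r - center for r in ratios]


# Hypothetical toy data: 10 RNAs with identical control signal.
control = [100.0] * 10

# Case A: only 2 of 10 RNAs are truly enriched (8-fold) in the IP.
ip_few = [100.0] * 8 + [800.0] * 2

# Case B: 6 of 10 RNAs are truly enriched (8-fold) in the IP.
ip_many = [100.0] * 4 + [800.0] * 6

few = median_normalized_log_ratios(ip_few, control)
many = median_normalized_log_ratios(ip_many, control)

# Hypothetical calling rule: normalized log2 ratio above 1 (more than 2-fold).
threshold = 1.0
called_few = sum(r > threshold for r in few)
called_many = sum(r > threshold for r in many)

print("enriched RNAs called, case A:", called_few)
print("enriched RNAs called, case B:", called_many)
```

In case A the median falls on the unenriched RNAs and both enriched RNAs are called. In case B the median falls on the enriched RNAs, so their normalized ratios become zero and none passes the threshold, which is the kind of failure a normalization built on a model of the IP experiment would need to avoid.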

Similar resources

I-40: Male Genome Programming, Infertility and Cancer

Background: During male germ cell differentiation, genome-wide reorganizations and highly specific programming of the male genome occur. These changes include not only the large-scale meiotic shuffling of genes, which takes place in spermatocytes, but also a complete “re-packaging” of the male genome in post-meiotic cells, leading to a highly compacted nucleo-protamine structure in the mature sperm...

Statistical Wavelet-based Image Denoising using Scale Mixture of Normal Distributions with Adaptive Parameter Estimation

Removing noise from images is a challenging problem in digital image processing. This paper presents an image denoising method based on a maximum a posteriori (MAP) density function estimator, which is implemented in the wavelet domain because of its energy compaction property. The performance of the MAP estimator depends on the proposed model for noise-free wavelet coefficients. Thus in the wa...

HLA-KIR Interactions and Immunity to Viral Infections

Host genetic factors play a central role in determining the clinical phenotype of human diseases. Associations between two polymorphic loci in the human genome, human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptors (KIRs), and genetically complex infectious diseases, particularly those of viral etiology, have been historically elusive. Hence, defining the influence of genetic di...

Normalization of qPCR array data: a novel method based on procrustes superimposition

MicroRNAs (miRNAs) are short, endogenous non-coding RNAs that function as guide molecules to regulate transcription of their target messenger RNAs. Several methods including low-density qPCR arrays are being increasingly used to profile the expression of these molecules in a variety of different biological conditions. Reliable analysis of expression profiles demands removal of technical variati...

Predicting Survival of Patients with Lung Cancer Using Improved Adaptive Neuro-Fuzzy Inference System

Introduction: Lung cancer is the main cause of mortality in both genders worldwide. This disease is caused by the uncontrollable growth and development of cells in one or both lungs. Although early diagnosis of this cancer is not an easy task, the earlier it is diagnosed, the higher the chance of treatment. The objective of this study was to develop an optimized prediction mod...

Journal title:

Volume 8, Issue

Pages -

Publication date 2013